Knowledge Base Population for Organization Mentions in Email

Authors

  • Ning Gao
  • Mark Dredze
  • Douglas W. Oard
Abstract

A prior study found that on average there are 6.3 named mentions of organizations found in email messages from the Enron collection, only about half of which could be linked to known entities in Wikipedia (Gao et al., 2014). That suggests a need for collection-specific approaches to entity linking, similar to those that have proven successful for person mentions. This paper describes a process for automatically constructing such a collection-specific knowledge base of organization entities for named mentions in Enron. A new public test collection for linking 130 mentions of organizations found in Enron email to either Wikipedia or to this new collection-specific knowledge base is also described. Together, Wikipedia entities plus the new collection-specific knowledge base cover 83% of the 130 organization mentions, a 14% (absolute) improvement over the 69% that could be linked to Wikipedia alone.
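The coverage figures in the abstract are related by simple subtraction: 83% combined coverage minus 69% Wikipedia-only coverage gives the 14-point absolute gain. The sketch below illustrates how such a coverage measure might be computed over two knowledge bases. It is only an illustration under stated assumptions: the `coverage` function, the placeholder entity names, and the toy mention list are hypothetical and are not taken from the paper or its test collection, and exact string membership stands in for a real entity linker.

```python
def coverage(mentions, *knowledge_bases):
    """Return the fraction of mentions found in at least one knowledge base."""
    if not mentions:
        return 0.0
    linked = sum(
        1 for mention in mentions
        if any(mention in kb for kb in knowledge_bases)
    )
    return linked / len(mentions)


# Hypothetical toy data (placeholders, not the Enron test collection).
wikipedia_kb = {"Org A", "Org B"}
collection_kb = {"Org C", "Org D"}
mentions = ["Org A", "Org B", "Org C", "Org D", "Org E"]

# Coverage when only the broad-coverage knowledge base is available.
wiki_only = coverage(mentions, wikipedia_kb)

# Coverage when the collection-specific knowledge base is added.
combined = coverage(mentions, wikipedia_kb, collection_kb)

# Absolute improvement is a difference of fractions, not a ratio.
absolute_gain = combined - wiki_only

print(f"Wikipedia only: {wiki_only:.0%}")
print(f"Combined:       {combined:.0%}")
print(f"Absolute gain:  {absolute_gain:.0%}")
```

Reporting the gain as an absolute difference, as the abstract does, keeps it directly comparable with the two coverage percentages it is derived from.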


Similar resources

A Test Collection for Email Entity Linking

Most prior work on entity linking has focused on linking name mentions found in third-person communication (e.g., news) to broad-coverage knowledge bases (e.g., Wikipedia). A restricted form of domain-specific entity linking has, however, been tried with email, linking mentions of people to specific email addresses. This paper introduces a new test collection for the task of linking mentions of...


Wikipedia and the Web of Confusable Entities: Experience from Entity Linking Query Creation for TAC 2009 Knowledge Base Population

The Text Analysis Conference (TAC) is a series of Natural Language Processing evaluation workshops organized by the National Institute of Standards and Technology. The Knowledge Base Population (KBP) track at TAC 2009, a hybrid descendant of the TREC Question Answering track and the Automated Content Extraction (ACE) evaluation program, is designed to support development of systems that are cap...


MSRA at TAC 2011: Entity Linking

The Knowledge Base Population task aims at advancing the state of the art for systems that automatically discover information about named entities and then incorporate this information in a knowledge source. The overall task of populating a knowledge base is decomposed into two related tasks: Entity Linking, where names must be aligned to entities in the KB, and Slot Filling, which involves min...


Overview of TAC-KBP2014 Entity Discovery and Linking Tasks

In this paper we give an overview of the Entity Discovery and Linking tasks at the Knowledge Base Population track at TAC 2014. In this year we introduced a new end-to-end English entity discovery and linking task which requires a system to take raw texts as input, automatically extract entity mentions, link them to a knowledge base, and cluster NIL mentions. In this paper we provide an overvie...


DBpedia based Ontological Concepts Driven Information Extraction from Unstructured Text

In this paper a knowledge base concept driven named entity recognition (NER) approach is presented. The technique is used for information extraction from news articles and linking it with background concepts in knowledge base. The work specifically focuses on extracting entity mentions from unstructured articles. The extraction of entity mentions from articles is based on the existing concepts ...



Publication date: 2016